Defining Classifier Regions for WSD Ensembles Using Word Space Features

Authors

  • Harri M. T. Saarikoski
  • Steve Legrand
  • Alexander F. Gelbukh
Abstract

Based on a recent evaluation of word sense disambiguation (WSD) systems [10], disambiguation methods have reached a standstill. In [10] we showed that it is possible to predict the best system for a target word using word features, and that this 'optimal ensembling method' can be used to build more accurate WSD ensembles (3-5% over Senseval state-of-the-art systems, with the same amount of potential still remaining). In the interest of developing even more accurate ensembles, we here define the strong regions for three popular and effective classifiers used for the WSD task (Naive Bayes, NB; Support Vector Machine, SVM; Decision Rules, D) using word features (word grain, number of positive and negative training examples, dominant sense ratio). We also discuss the effect of the remaining (feature-based) factors.
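The word features named in the abstract can be illustrated with a minimal sketch. The definitions used here are assumptions, not taken from the paper: grain is read as the number of distinct senses seen in training, the dominant sense ratio as the share of the most frequent sense, and the positive/negative counts for a sense as the examples labelled with that sense versus all others. The function name `word_features` and the sense labels are hypothetical.

```python
from collections import Counter

def word_features(sense_labels):
    """Per-word features from the sense labels of a word's training examples.

    Assumed definitions (not confirmed by the source):
      grain                = number of distinct senses
      dominant_sense_ratio = share of the most frequent sense
      positives/negatives  = per-sense counts of matching / non-matching examples
    """
    counts = Counter(sense_labels)
    total = len(sense_labels)
    dominant_sense, dominant_count = counts.most_common(1)[0]
    return {
        "grain": len(counts),
        "train_size": total,
        "dominant_sense": dominant_sense,
        "dominant_sense_ratio": dominant_count / total,
        "positives": dict(counts),
        "negatives": {sense: total - n for sense, n in counts.items()},
    }

# Hypothetical training labels for one target word
features = word_features(["bank/1", "bank/1", "bank/2", "bank/1", "bank/3"])
```

A feature vector of this kind could then serve as input to whatever rule or learner selects the classifier for each target word; the selection step itself is not sketched here because the abstract does not specify it.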


Similar resources

UBC-ALM: Combining k-NN with SVD for WSD

This work describes the University of the Basque Country system (UBC-ALM) for the lexical sample and all-words WSD subtasks of SemEval-2007 task 17, where it ranked second and fifth respectively. The system is based on a combination of k-Nearest Neighbor classifiers, with each classifier learning from a distinct set of features: local features (syntactic, collocation features),...


GAMBL, genetic algorithm optimization of memory-based WSD

GAMBL is a word expert approach to WSD in which each word expert is trained using memory-based learning. Joint feature selection and algorithm parameter optimization are achieved with a genetic algorithm (GA). We use a cascaded classifier approach in which the GA optimizes local context features and the output of a separate keyword classifier (rather than also optimizing the keyword features tog...


Integrating Collocation Features in Chinese Word Sense Disambiguation

The selection of features is critical in providing discriminative information for classifiers in Word Sense Disambiguation (WSD). Uninformative features will degrade the performance of classifiers. Based on the strong evidence that an ambiguous word expresses a unique sense in a given collocation, this paper reports our experiments on automatic WSD using collocation as local features based on t...


High WSD Accuracy Using Naive Bayesian Classifier with Rich Features

Word Sense Disambiguation (WSD) is the task of choosing the right sense of an ambiguous word given a context. Using Naive Bayesian (NB) classifiers is known as one of the best methods for supervised approaches for WSD (Mooney, 1996; Pedersen, 2000), and this model usually uses only a topic context represented by unordered words in a large context. In this paper, we show that by adding more rich...


Addressing the MFS Bias in WSD systems

Word Sense Disambiguation (WSD) systems tend to have a strong bias towards assigning the Most Frequent Sense (MFS), which results in high performance on the MFS but very low performance on the less frequent senses. We addressed the MFS bias in WSD systems by combining the output from a WSD system with a set of mostly static features to create an MFS classifier to decide when to and not to c...




Publication date: 2006